Taking advantage of Wikipedia in Natural Language Processing

Authors

  • Tae Yano
  • Moonyoung Kang
Abstract

Wikipedia is an online encyclopedia created on the web by many contributors. Although it was not created to support research in language processing, its size and well-formed structure are attracting many researchers in the area. In this review, we select five representative papers to show various creative uses of Wikipedia within the past three years.

Similar resources

WRPA: A System for Relational Paraphrase Acquisition from Wikipedia

In this paper we present WRPA, a system for Relational Paraphrase Acquisition from Wikipedia. WRPA extracts paraphrasing patterns that hold a particular relation between two entities taking advantage of Wikipedia structure. What is new in this system is that Wikipedia’s exploitation goes beyond infoboxes, reaching itemized information embedded in Wikipedia pages. WRPA is language independent, a...

Disentangling the Wikipedia Category Graph for Corpus Extraction

In several areas of research such as knowledge management and natural language processing, domain-specific corpora are required for tasks such as terminology extraction and ontology learning. The presented investigations herein are based on the assumption that Wikipedia can be used for the purpose of corpus extraction. It presents the advantage of possessing a semantic layer, which should ease ...

WRPA: A System for Relational Paraphrase Acquisition from Wikipedia / WRPA: Un sistema para la adquisición de paráfrasis de relaciones de la Wikipedia

In this paper we present WRPA, a system for Relational Paraphrase Acquisition from Wikipedia. WRPA extracts paraphrasing patterns that hold a particular relation between two entities taking advantage of Wikipedia structure. What is new in this system is that Wikipedia’s exploitation goes beyond infoboxes, reaching itemized information embedded in Wikipedia pages. WRPA is language independent, a...

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study

Structural information in web text provides natural annotations for NLP problems such as word segmentation and parsing. In this paper we propose a discriminative learning algorithm to take advantage of the linguistic knowledge in large amounts of natural annotations on the Internet. It utilizes the Internet as an external corpus with massive (although slight and sparse) natural annotations, and...

Publication date: 2008